Loading library

Assignment 1

1. Reading input and the network graph

Analysis: The clusters that I see are based of the following people. First cluster is based on ‘Jamal Zougam’, second cluster is based on ‘Semaan Gaby Eid’ and the third cluster is based on ‘Taysir Alouny’.

2. Adding the highlight of degree 1 and 2

Analysis: ‘Jamal Zougam’ is the individual with the maximum connection in the network, hence this person is the most influential to the network in terms of spreading information. From Wikipedia we know the following - “Zougam (main accused in the Madrid bombing) owned a mobile phone shop in the Lavapiés neighborhood in Madrid. He is believed to be the person who sold telephones which were used to detonate the bombs in the attack. He also reportedly helped construct the bombs and was one of the first to be arrested.”

3. Identification of Clusters

Analysis: The automatic cluster detection found only two clusters, ‘Jamal Zougam’ cluster and other cluster was a lossly based on manually identified clusters (‘Semaan Gaby Eid’ and ‘Taysir Alouny’)

4. Heatmap plot for cluster identification

Analysis: Most profound cluster is based again on ‘Jamal Zougam’ and this was identified in step 1 and 3.

Assignment 2

1. Animated Bubble Chart

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

## Warning: `line.width` does not currently support multiple values.

Analysis: Most countries very little coal and oil when compared to US and China. Over of the years US consumption of oil increases over coal. Despite its demand India consumes way lesser coal and oil compared to China. France Oil consumption reduced from the peak of 1983, over the years they are reducing their dependence on oil and coal, the similar was the trend in Germany, we know this is due to Germany uses Nuclear and Renewable sources. In the year 1998 there was a brief reversal in the consumption of Oil by China however this trend quickly died and the growth in consumption of coal increased.

2. Animated Bubble Chart for selected countries

Analysis: The reversal in the consumption of coal and oil by these countries starts from 1991 for Germany. The reduction in the coal occurred due to merging of East and West Germany. France is a champion of nuclear energy, they switched to almost nuclear only power in the year 1983 and have never looked back since then, this was due to the series of nuclear test that France was pushing for in the height of cold war.

3. Animated chart of fuel consumption in terms of oil consumption

Analysis:

4. Animated chart of fuel consumption in terms of oil consumption with elastic easing

Analysis: Due to elastic easing, large changes are easier seen this is due to the fact the larger the change larger is the oscialltion/jump due to elastic effect.

5. Custom Tour of dataset

## Value 0.880 32.4% better (0.781 away) - NEW BASIS
## Value 0.943 7.9% better (0.394 away) - NEW BASIS
## Value 0.972 3.0% better (0.448 away) - NEW BASIS
## Value 0.985 1.4% better (0.237 away) - NEW BASIS
## Value 0.988 0.3% better (0.096 away) - NEW BASIS
## Value 0.988 0.1% better (0.036 away)
## Value 0.988 0.1% better (0.055 away)
## Value 0.989 0.1% better (0.066 away) - NEW BASIS
## Value 0.989 0.0% better (0.041 away)
## Value 0.989 0.0% better (0.024 away)
## Value 0.989 0.1% better (0.056 away)
## Value 0.989 0.1% better (0.067 away)
## Value 0.989 0.0% better (0.040 away)
## Value 0.989 0.0% better (0.035 away)
## Value 0.989 0.0% better (0.031 away)
## Value 0.989 0.1% better (0.088 away)
## Value 0.989 0.0% better (0.038 away)
## Value 0.989 0.1% better (0.047 away)
## Value 0.989 0.1% better (0.043 away)
## Value 0.990 0.1% better (0.061 away) - NEW BASIS
## Value 0.990 0.0% better (0.039 away)
## Value 0.990 0.0% better (0.026 away)
## Value 0.990 0.0% better (0.034 away)
## Value 0.990 0.1% better (0.077 away)
## Value 0.990 0.0% better (0.051 away)
## Value 0.990 0.0% better (0.024 away)
## Value 0.990 0.0% better (0.037 away)
## Value 0.990 0.0% better (0.030 away)
## Value 0.990 0.0% better (0.022 away)
## Value 0.990 0.0% better (0.016 away)
## Value 0.990 0.0% better (0.031 away)
## Value 0.990 0.0% better (0.055 away)
## Value 0.990 0.1% better (0.071 away)
## Value 0.990 0.0% better (0.086 away)
## Value 0.990 0.1% better (0.149 away)
## Value 0.990 0.0% better (0.024 away)
## Value 0.990 0.0% better (0.019 away)
## Value 0.990 0.0% better (0.025 away)
## Value 0.990 0.0% better (0.058 away)
## Value 0.990 0.0% better (0.020 away)
## Value 0.990 0.0% better (0.035 away)
## Value 0.990 0.0% better (0.051 away)
## Value 0.990 0.0% better (0.025 away)
## Value 0.990 0.0% better (0.017 away)
## No better bases found after 25 tries.  Giving up.
## Final projection: 
## 0.740  -0.058  
## -0.041  0.845  
## 0.257  -0.262  
## -0.091  0.046  
## -0.024  -0.022  
## -0.502  -0.284  
## -0.212  0.277  
## -0.281  -0.230  
## Value 0.862 29.7% better (0.781 away) - NEW BASIS
## Value 0.921 7.9% better (0.485 away) - NEW BASIS
## Value 0.969 5.2% better (0.715 away) - NEW BASIS
## Value 0.980 1.1% better (0.171 away) - NEW BASIS
## Value 0.980 0.1% better (0.046 away) - NEW BASIS
## Value 0.982 0.3% better (0.083 away) - NEW BASIS
## Value 0.982 0.0% better (0.024 away)
## Value 0.982 0.0% better (0.032 away)
## Value 0.982 0.1% better (0.092 away)
## Value 0.982 0.1% better (0.052 away)
## Value 0.982 0.0% better (0.020 away)
## Value 0.982 0.0% better (0.031 away)
## Value 0.982 0.0% better (0.018 away)
## Value 0.982 0.1% better (0.039 away)
## Value 0.981 0.0% better (0.006 away)
## Value 0.982 0.0% better (0.031 away)
## Value 0.982 0.0% better (0.048 away)
## Value 0.982 0.0% better (0.052 away)
## Value 0.982 0.1% better (0.070 away)
## Value 0.983 0.1% better (0.100 away) - NEW BASIS
## Value 0.983 0.1% better (0.057 away)
## Value 0.983 0.0% better (0.030 away)
## Value 0.984 0.1% better (0.122 away) - NEW BASIS
## Value 0.984 0.0% better (0.041 away)
## Value 0.985 0.1% better (0.113 away) - NEW BASIS
## Value 0.985 0.0% better (0.029 away)
## Value 0.985 0.0% better (0.023 away)
## Value 0.985 0.0% better (0.042 away)
## Value 0.985 0.0% better (0.033 away)
## Value 0.985 0.0% better (0.034 away)
## Value 0.985 0.0% better (0.092 away)
## Value 0.985 0.0% better (0.041 away)
## Value 0.985 0.0% better (0.039 away)
## Value 0.985 0.0% better (0.045 away)
## Value 0.985 0.0% better (0.027 away)
## Value 0.985 0.0% better (0.022 away)
## Value 0.986 0.1% better (0.106 away)
## Value 0.985 0.0% better (0.027 away)
## Value 0.985 0.0% better (0.022 away)
## Value 0.986 0.1% better (0.097 away)
## Value 0.985 0.0% better (0.034 away)
## Value 0.985 0.0% better (0.029 away)
## Value 0.985 0.0% better (0.024 away)
## Value 0.985 0.0% better (0.046 away)
## Value 0.985 0.0% better (0.011 away)
## Value 0.985 0.0% better (0.026 away)
## Value 0.985 0.0% better (0.045 away)
## Value 0.986 0.1% better (0.108 away)
## Value 0.985 0.0% better (0.029 away)
## No better bases found after 25 tries.  Giving up.
## Final projection: 
## 0.727  -0.203  
## -0.280  0.663  
## 0.277  0.425  
## -0.155  -0.406  
## -0.454  -0.354  
## 0.012  0.174  
## -0.189  0.122  
## -0.226  -0.056  
## Value 0.948 42.7% better (0.710 away) - NEW BASIS
## Value 0.956 0.9% better (0.147 away) - NEW BASIS
## Value 0.964 0.8% better (0.219 away) - NEW BASIS
## Value 0.978 1.4% better (0.185 away) - NEW BASIS
## Value 0.985 0.8% better (0.165 away) - NEW BASIS
## Value 0.988 0.3% better (0.107 away) - NEW BASIS
## Value 0.989 0.2% better (0.079 away) - NEW BASIS
## Value 0.990 0.0% better (0.030 away)
## Value 0.990 0.1% better (0.099 away) - NEW BASIS
## Value 0.991 0.1% better (0.038 away)
## Value 0.991 0.0% better (0.041 away)
## Value 0.991 0.0% better (0.045 away)
## Value 0.990 0.0% better (0.029 away)
## Value 0.990 0.0% better (0.013 away)
## Value 0.991 0.1% better (0.054 away)
## Value 0.991 0.0% better (0.038 away)
## Value 0.991 0.0% better (0.030 away)
## Value 0.990 0.0% better (0.019 away)
## Value 0.991 0.0% better (0.036 away)
## Value 0.991 0.1% better (0.060 away)
## Value 0.991 0.0% better (0.033 away)
## Value 0.991 0.0% better (0.039 away)
## Value 0.990 0.0% better (0.019 away)
## Value 0.991 0.0% better (0.038 away)
## Value 0.991 0.1% better (0.047 away)
## Value 0.991 0.1% better (0.048 away)
## Value 0.991 0.1% better (0.039 away)
## Value 0.991 0.0% better (0.033 away)
## Value 0.991 0.0% better (0.028 away)
## Value 0.991 0.1% better (0.057 away)
## Value 0.991 0.0% better (0.039 away)
## Value 0.991 0.0% better (0.038 away)
## Value 0.991 0.0% better (0.060 away)
## No better bases found after 25 tries.  Giving up.
## Final projection: 
## 0.812  -0.122  
## -0.080  0.859  
## 0.118  0.186  
## -0.191  -0.172  
## -0.226  -0.063  
## -0.149  -0.405  
## 0.061  0.101  
## -0.456  -0.072  
## Value 0.728 9.6% better (0.781 away) - NEW BASIS
## Value 0.971 33.6% better (0.781 away) - NEW BASIS
## Value 0.979 1.2% better (0.133 away) - NEW BASIS
## Value 0.991 1.3% better (0.277 away) - NEW BASIS
## Value 0.991 0.1% better (0.039 away)
## Value 0.991 0.1% better (0.060 away) - NEW BASIS
## Value 0.992 0.1% better (0.054 away) - NEW BASIS
## Value 0.994 0.2% better (0.079 away) - NEW BASIS
## Value 0.996 0.2% better (0.107 away) - NEW BASIS
## Value 0.996 0.0% better (0.037 away)
## Value 0.996 0.0% better (0.018 away)
## Value 0.996 0.1% better (0.057 away)
## Value 0.996 0.0% better (0.032 away)
## Value 0.996 0.1% better (0.045 away)
## Value 0.996 0.0% better (0.013 away)
## Value 0.996 0.1% better (0.040 away)
## Value 0.996 0.1% better (0.032 away)
## Value 0.996 0.1% better (0.063 away)
## Value 0.996 0.0% better (0.045 away)
## Value 0.996 0.0% better (0.030 away)
## Value 0.996 0.1% better (0.084 away)
## Value 0.996 0.1% better (0.078 away)
## Value 0.996 0.1% better (0.065 away)
## Value 0.996 0.0% better (0.034 away)
## Value 0.996 0.0% better (0.029 away)
## Value 0.996 0.1% better (0.040 away)
## Value 0.996 0.0% better (0.031 away)
## Value 0.996 0.0% better (0.034 away)
## Value 0.996 0.0% better (0.074 away)
## Value 0.996 0.0% better (0.029 away)
## Value 0.996 0.1% better (0.043 away)
## Value 0.996 0.0% better (0.080 away)
## Value 0.996 0.1% better (0.048 away)
## No better bases found after 25 tries.  Giving up.
## Final projection: 
## 0.720  0.061  
## -0.107  0.841  
## 0.010  0.111  
## -0.316  0.009  
## -0.233  -0.393  
## 0.051  -0.297  
## 0.317  -0.173  
## -0.461  -0.057  
## Value 0.840 26.4% better (0.781 away) - NEW BASIS

Analysis: Yes clusters are depended on the year range, this is due to the fact that many european countries moved away from coal as a group and also US and India strated growing in similar years. China and Brazil have the largest effect on the projection.

Apendix

knitr::opts_chunk$set(echo = FALSE)
library(dplyr)
library(tidyr)
library(plotly)
library(visNetwork)
library(igraph)
library(seriation)
library(tourr)

set.seed(42)

edges <- read.delim("trainData.dat", header = FALSE, sep  = " ")
nodes <- read.delim("trainMeta.dat", header = FALSE, sep = " ")


nodes$id <- rownames(nodes)
colnames(nodes) <- c("label", "group", "id")
colnames(edges) <- c("temp", "from", "to", "value")
edges$temp <- NULL
graph <- graph.data.frame(edges, directed = T)
degree_value <- degree(graph)
nodes$value <- degree_value[match(nodes$id, names(degree_value))]
nodes <- na.omit(nodes) # removing non connected nodes

visNetwork(nodes = nodes, edges = edges, main = "Network of people invloved in Madrid Bombing") %>% 
  visGroups(groupname = "0", color = "blue") %>% 
  visGroups(groupname = "1", color = "red") %>% 
  visEdges(arrows = "to") %>%
  visOptions(highlightNearest = list(enabled =TRUE, 
                                     algorithm = "hierarchical", degree = 1), 
             collapse = TRUE,
             selectedBy = "group",
             nodesIdSelection = TRUE) %>%
  visLayout(randomSeed = 42) %>%
  visPhysics(solver= "repulsion") %>% 
  visLegend() %>% addFontAwesome()

visNetwork(nodes = nodes, edges = edges, main = "Network of people invloved in Madrid Bombing") %>% 
  visGroups(groupname = "0", color = "blue") %>% 
  visGroups(groupname = "1", color = "red") %>% 
  visEdges(arrows = "to") %>%
  visOptions(highlightNearest = list(enabled =TRUE,  algorithm = "hierarchical",
                                     degree = list(from = 1, to = 2)), 
             collapse = TRUE,
             selectedBy = "group",
             nodesIdSelection = TRUE) %>%
  visLayout(randomSeed = 42) %>%
  visPhysics(solver= "repulsion") %>% 
  visLegend() %>% addFontAwesome()
graph_for_clusters <- graph.data.frame(edges, directed = FALSE)
clusters <- cluster_edge_betweenness(graph_for_clusters, directed = T)
nodes$clusters <- clusters$membership

visNetwork(nodes = nodes, edges = edges, main = "Network of people invloved in Madrid Bombing") %>% 
  visEdges(arrows = "to") %>%
  visOptions(highlightNearest = list(enabled =TRUE,  algorithm = "hierarchical",
                                     degree = list(from = 1, to = 2)), 
             collapse = TRUE,
             selectedBy = "group",
             nodesIdSelection = TRUE) %>%
  visLayout(randomSeed = 42) %>%
  visPhysics(solver= "repulsion") %>% 
  visLegend() %>% addFontAwesome() %>%visIgraphLayout()
clusters <- cluster_edge_betweenness(graph_for_clusters)
nodes$clusters <- clusters$membership
netm <- get.adjacency(graph_for_clusters, sparse=F)
colnames(netm) <- nodes$label
rownames(netm) <- nodes$label
rowdist<-dist(netm)

order1<-seriate(rowdist, "HC")
ord1<-get_order(order1)
reordmatr<-netm[ord1,ord1]

plot_ly(z=~reordmatr, x=~colnames(reordmatr), 
        y=~rownames(reordmatr), type="heatmap") %>% layout(title = "Heatmap to find clusters among the bombing suspects")
oilcoal_data <- read.csv2("Oilcoal.csv", header = TRUE, sep = ";")

oilcoal_data <- oilcoal_data[,c("Country", "Year", "Coal", "Oil", "Marker.size")]

oilcoal_data %>% plot_ly(x=~Coal, y=~Oil, frame =~Year, type = 'scatter', text = ~Country, mode = 'markers', size= ~Marker.size) %>% animation_opts(100, easing = "cubic", redraw = F) %>% layout(title="Timeline of Consumption of Oil vs. Coal by Country")
oilcoal_data %>% filter(Country %in% c("France", "Germany")) %>% plot_ly(x=~Coal, y=~Oil, frame =~Year, type = 'scatter', text = ~Country, mode = 'markers') %>% animation_opts(100, easing = "cubic", redraw = F) %>% layout(title="Timeline of Consumption of Oil vs. Coal by Country")
temp1 <- oilcoal_data %>% group_by(Year, Country) %>% mutate(oil_p = Oil/sum(Oil, Coal))
temp2 <- oilcoal_data %>% group_by(Year, Country) %>% mutate(oil_p = 0)
temp3 <- rbind(temp1, temp2)
temp3 %>% plot_ly(x=~Country, y=~oil_p, frame =~Year, type = 'scatter', mode='line', text = ~Country) %>% animation_opts(100, easing = "cubic", redraw = F) %>% layout(title="Timeline of energy in terms of oil consumption")

temp3 %>% plot_ly(x=~Country, y=~oil_p, frame =~Year, type = 'scatter', mode='line', text = ~Country) %>% animation_opts(100, easing = "elastic", redraw = F) %>% layout(title="Timeline of energy in terms of oil consumption")


oilcoal <- read.csv2("Oilcoal.csv", header = TRUE, sep = ";")
oilcoal$X <- NULL
set.seed(42)

oilcoal_tour <- oilcoal[, c("Country", "Year", "Coal")]
oilcoal_tour$Coal <- as.numeric(gsub(",", ".", oilcoal_tour$Coal))
oilcoal_tour <- oilcoal_tour %>%spread(Country, Coal)
oilcoal_scale <- rescale(oilcoal_tour[, 2:9])

rownames(oilcoal_scale) <- oilcoal_tour$Year
colnames(oilcoal_scale) <- names(oilcoal_tour)[-1]


tour <- new_tour(oilcoal_scale, guided_tour(cmass), NULL)
steps <- c(0, rep(1/15, 200))

Projs<-lapply(steps, function(step_size){  
  step <- tour(step_size)
  if(is.null(step)) {
    .GlobalEnv$tour<- new_tour(oilcoal_scale, guided_tour(cmass), NULL)
    step <- tour(step_size)
  }
  step
}
)

# projection of each observation
tour_dat <- function(i) {
  step <- Projs[[i]]
  proj <- center(oilcoal_scale %*% step$proj)
  data.frame(x = proj[,1], y = proj[,2], state = rownames(oilcoal_scale))
}
# projection of each variable's axis
proj_dat <- function(i) {
  step <- Projs[[i]]
  data.frame(
    x = step$proj[,1], y = step$proj[,2], variable = colnames(oilcoal_scale)
  )
}
stepz <- cumsum(steps)
# tidy version of tour data
tour_dats <- lapply(1:length(steps), tour_dat)
tour_datz <- Map(function(x, y) cbind(x, step = y), tour_dats, stepz)
tour_dat <- dplyr::bind_rows(tour_datz)
# tidy version of tour projection data
proj_dats <- lapply(1:length(steps), proj_dat)
proj_datz <- Map(function(x, y) cbind(x, step = y), proj_dats, stepz)
proj_dat <- dplyr::bind_rows(proj_datz)
ax <- list(
  title = "", showticklabels = FALSE,
  zeroline = FALSE, showgrid = FALSE,
  range = c(-1.1, 1.1)
)
# for nicely formatted slider labels
options(digits = 3)
tour_dat <- highlight_key(tour_dat, ~state, group = "A")
tour <- proj_dat %>%
  plot_ly(x = ~x, y = ~y, frame = ~step, color = I("black")) %>%
  add_segments(xend = 0, yend = 0, color = I("gray80")) %>%
  add_text(text = ~variable) %>%
  add_markers(data = tour_dat, text = ~state, ids = ~state, hoverinfo = "text") %>%
  layout(xaxis = ax, yaxis = ax, title = "Animated tour of the coal consumption by country")#%>%animation_opts(frame=0, transition=0, redraw = F)
tour